research-article
DOI: 10.1145/3404835.3462942

Answer Complex Questions: Path Ranker Is All You Need

Published: 11 July 2021

ABSTRACT

Currently, the most popular approach to open-domain Question Answering (QA) adopts the "Retriever and Reader" pipeline: a retriever extracts a list of candidate documents from a large collection, a ranker orders the most relevant documents, and a reader extracts the answer from the candidates. Existing studies take a greedy strategy in the sense that they rank only the samples at the current hop and ignore global information across the whole document sequence. In this paper, we propose a purely rank-based framework, Thinking Path Re-Ranker (TPRR), which comprises a Thinking Path Ranker (TPR) for generating document sequences called "paths" and an External Path Reranker (EPR) for selecting the best path from the candidate paths generated by TPR. Specifically, TPR leverages the scores of a dense model and conditional probabilities to score full paths. Moreover, to further enhance the dense ranker during iterative training, we propose a "thinking" negatives selection method in which the top-K candidates treated as negatives at the current hop are adjusted dynamically through supervised signals. After TPR produces multiple supporting paths, the EPR component, which integrates several fine-grained training tasks for QA, selects the best path for answer extraction. We have tested our proposed solution on the multi-hop dataset HotpotQA in the fullwiki setting, and the results show that TPRR significantly outperforms existing state-of-the-art models. Moreover, our method has held first place on the official HotpotQA leaderboard since February 1, 2021 under the fullwiki setting. Code is available at https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/nlp/tprr.
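The core idea of scoring full document sequences, rather than greedily ranking each hop in isolation, can be illustrated with a small beam-search sketch. This is not the paper's implementation: the hop scorer below is a hypothetical stand-in for the dense model, and the question, document titles, and `toy_score` function are invented for illustration. A path's score is the log of the product of per-hop conditional probabilities, so the ranker compares whole paths, not single-hop choices.

```python
import math

def rank_paths(question, candidates_per_hop, hop_score, beam_size=2):
    """Beam search over document sequences ("paths").

    Each partial path is scored by the sum of per-hop log scores,
    i.e. the log of the product of conditional probabilities
    P(doc_t | question, doc_1 .. doc_{t-1}).
    """
    beams = [([], 0.0)]  # (path so far, cumulative log score)
    for candidates in candidates_per_hop:
        expanded = []
        for path, score in beams:
            for doc in candidates:
                if doc in path:  # a path never repeats a document
                    continue
                p = hop_score(question, path, doc)  # assumed in (0, 1]
                expanded.append((path + [doc], score + math.log(p)))
        # keep only the top-`beam_size` partial paths for the next hop
        beams = sorted(expanded, key=lambda b: b[1], reverse=True)[:beam_size]
    return beams  # candidate full paths, best first

# Toy scorer (hypothetical): favours documents sharing words with the question.
def toy_score(question, path, doc):
    q_words = set(question.lower().split())
    overlap = len(q_words & set(doc.lower().split()))
    return (1 + overlap) / (1 + len(q_words))

paths = rank_paths(
    "Which band was formed first, Hole or Nirvana?",
    [["Hole (band)", "Nirvana (band)", "Seattle"],
     ["Courtney Love", "Kurt Cobain", "Nirvana (band)"]],
    toy_score,
    beam_size=2,
)
best_path, best_score = paths[0]
```

Because scores accumulate across hops, a document that looks mediocre at hop one can still head the best full path if it leads to strong later hops, which is exactly the global information a per-hop greedy ranker discards.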

Supplemental Material: SIGIR21-fp0790.mp4 (125.5 MB)


• Published in:
  SIGIR '21: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
  July 2021, 2998 pages
  ISBN: 9781450380379
  DOI: 10.1145/3404835
  Copyright © 2021 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States


Qualifiers: research-article

Acceptance Rates: Overall acceptance rate 792 of 3,983 submissions, 20%
